Basic procedure

Each trial has an input phase (left) and a test phase (right). During the input phase, the speaker directly names objects from a particular category. In the test phase, new objects from the same category are shown, but the speaker ambiguously asks for “it”. Main question: Do children pick the object from the same category?

Setup for child experiment. Input trial on the left and test trial on the right. Children heard pre-recorded utterances.

General note on inference

I ran Bayesian GLMMs fit via brms. The models include random effects for subject and item (speaker) and random slopes for condition wherever applicable. Model comparison was my main way of doing inference about age and condition effects. I used the following indicators (following Richard McElreath’s book): WAIC scores with their standard errors (SE) and model weights, as reported in the tables below.

After finding the “winning model”, I use Bayes Factors to compare it to alternative models without the predictors in question. I also look at the predictors directly within the winning model to check their direction (positive vs. negative) and whether the CI overlaps with 0.
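The check on a single predictor can be sketched in a few lines. This is only an illustration in Python, not the brms code used for the report: the function names are made up here, the intervals are assumed to be equal-tailed percentile intervals, and the draws are simulated rather than taken from any fitted model.

```python
import random
from statistics import mean


def percentile(sorted_draws, q):
    """Linearly interpolated quantile (q in [0, 1]) of pre-sorted draws."""
    pos = q * (len(sorted_draws) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_draws) - 1)
    frac = pos - lo
    return sorted_draws[lo] * (1 - frac) + sorted_draws[hi] * frac


def summarize_posterior(draws):
    """Posterior mean, 50% and 95% intervals, and whether the 95% CI excludes 0."""
    s = sorted(draws)
    ci50 = (percentile(s, 0.25), percentile(s, 0.75))
    ci95 = (percentile(s, 0.025), percentile(s, 0.975))
    return {
        "mean": mean(s),
        "ci50": ci50,
        "ci95": ci95,
        "ci95_excludes_zero": ci95[0] > 0 or ci95[1] < 0,
    }


# Hypothetical posterior draws for one coefficient (NOT data from this study).
rng = random.Random(1)
draws = [rng.gauss(0.4, 0.3) for _ in range(4000)]
print(summarize_posterior(draws))
```

The sign of the mean gives the direction of the effect, and `ci95_excludes_zero` corresponds to the "CI overlaps with 0" check described above.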

General note on adult data

We have adult data (MTurk) for Experiments 1 and 3. For Experiment 2, we have data that tests the effect of amount of input, but with different conditions. Adult experiments were not pre-registered, mainly because we used them to find the best procedure. If we were to include adults in the paper, we might want to consider running them again with a pre-registration. Here, they are only included in the plots.

Experiment 1: Simple inference

Children received 6 input trials before the test trials. The main question was whether they pick the object from the same category.

Participants

We tested 2, 3 and 4yo. We added 2yo later and increased the sample size for them because we expected a smaller effect. Children received 4 trials in a single condition.

age_group n
2 30
3 21
4 20

Results

Comparison to chance

We bin data by age and use a one-sample Bayesian t-test to compare performance to chance. The table below shows Bayes Factors (BF) for each age group.

age_group mean BF
2 0.417 0.592
3 0.595 90.772
4 0.550 10.392

3 and 4yo seem to make the basic inference, while the evidence for 2yo is weak (BF below 1). Because of that, we did not run 2yo in subsequent experiments.
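For readers who want to see what such a Bayes Factor involves, here is a minimal Python sketch of a one-sample JZS Bayes Factor (Cauchy prior on effect size, Rouder et al., 2009). Several things are assumptions, because the report does not state them: the JZS form of the test, the prior scale r = √2/2, the chance level of 0.5, and per-child proportion scores as the unit of analysis. The scores in the example are invented.

```python
from math import exp, pi, sqrt
from statistics import mean, stdev


def jzs_bf10(t, n, r=sqrt(2) / 2):
    """BF for H1 (effect != 0) over H0, from a one-sample t value and sample size n."""
    nu = n - 1

    def integrand(u):
        g = exp(u)  # integrate over u = log(g) for numerical stability
        a = 1 + n * g
        marginal = a ** -0.5 * (1 + t * t / (a * nu)) ** (-(nu + 1) / 2)
        # Inverse-gamma(1/2, r^2/2) prior on g, i.e. a Cauchy(0, r) prior on effect size
        prior = r / sqrt(2 * pi) * g ** -1.5 * exp(-r * r / (2 * g))
        return marginal * prior * g  # trailing g is the Jacobian of g = exp(u)

    # Composite Simpson rule on [-20, 20]; the integrand is negligible outside.
    lo, hi, steps = -20.0, 20.0, 4000
    h = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    alt = total * h / 3
    null = (1 + t * t / nu) ** (-(nu + 1) / 2)
    return alt / null


def bf_against_chance(scores, chance=0.5):
    """BF10 for the mean of `scores` differing from `chance` (assumed 0.5 here)."""
    n = len(scores)
    t = (mean(scores) - chance) / (stdev(scores) / sqrt(n))
    return jzs_bf10(t, n)


# Invented per-child proportions of same-category choices (NOT study data).
scores = [0.75, 0.5, 1.0, 0.5, 0.75, 0.25, 0.75, 1.0, 0.5, 0.75]
print(round(bf_against_chance(scores), 2))
```

Values above 1 favor a difference from chance and values below 1 favor the null, which is how the BF column in the table above is read.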

Effect of age

Here we use a Bayesian GLMM to look at the effect of age. For inference we compare it to a model without age as a predictor.

The table below shows WAIC scores and weights for each model (RE = random effects, same for each model). The column BF_age_model shows the Bayes Factor for the model comparison between the full model (here: with age) and each reduced model (here: without age).

model                        WAIC    SE    weight  BF_age_model
model_w_age: age_num + RE    386.92  7.67  0.6     -
null_model: 1 + RE           387.71  7.09  0.4     1.66

The model comparison favors the model with age as predictor. This is broadly in line with the analysis binned by age. However, the effect of age is weak: the BF in favor of the model with age is low (1.66) and the posterior interval of the age predictor overlaps with 0 (see plot below).
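The weight column can be recovered from the WAIC scores alone. The sketch below assumes the weights are Akaike-type weights as defined in McElreath's book (each model's relative likelihood exp(-0.5 * ΔWAIC), normalized to sum to 1). The report does not say how brms or its helper packages computed them, and the function name is made up for illustration.

```python
from math import exp


def waic_weights(waic):
    """Akaike-type weights from a dict mapping model name -> WAIC score."""
    best = min(waic.values())  # lower WAIC is better
    rel = {name: exp(-0.5 * (score - best)) for name, score in waic.items()}
    total = sum(rel.values())
    return {name: value / total for name, value in rel.items()}


# WAIC scores from the Experiment 1 table above.
waic = {"model_w_age": 386.92, "null_model": 387.71}
for name, weight in waic_weights(waic).items():
    print(f"{name}: {weight:.2f}")
```

Under this assumption the weights come out at 0.60 and 0.40, matching the table. A WAIC difference of less than 1 point therefore translates into only a modest preference for the age model.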

Posterior distribution for model fixed effects. Point indicates posterior mean, thick line shows 50%CI and thin line shows 95% CI.

Experiment 2: Variable input

Here we varied the amount of input children received before each test trial. They either heard the speaker name six objects from the same category (high input) or just one (low input). The main question was whether their performance drops with lower input.

Participants

The sample size in this study was rather small because age effects were not our major focus. We were mainly interested in the effect of condition. Children again received 4 trials, 2 in each condition.

age_group n
3 18
4 15

Results

Effect of condition

The table below shows WAIC scores and weights for each model. The column BF_int_model shows the Bayes Factor for the model comparison between the full model (interaction model) and each reduced model (here: with main effect of condition or null model without condition).

model                                          WAIC    SE     weight  BF_int_model
model_w_interaction: condition * age_num + RE  177.93  11.07  0.22    -
model_w_condition: condition + age_num + RE    177.16  9.98   0.33    4.35
null_model: age_num + RE                       176.53  9.27   0.45    10.54

The WAIC-based model comparison favors the null model with only age as predictor. Note, however, that the Bayes Factors favor the interaction model over both reduced models. Condition does not seem to affect children’s performance. Overall, adding a second condition seems to have made the general task harder for younger children. This is also reflected in the reliably positive effect of age in the null model (plotted below).

Posterior distribution for model fixed effects. Point indicates posterior mean, thick line shows 50%CI and thin line shows 95% CI.

Experiment 3: Speaker change

Here we tested whether children make speaker-specific inferences, that is, whether they see the topic of a conversation as specific to a particular speaker. Children received 6 input trials with one speaker, then the speaker left the scene and either the same or a different speaker returned. At test, the speaker always asked for “it”.

Participants

Here we again tested a larger sample to also look at age effects. Children received 4 trials, 2 in each condition.

age_group n
3 30
4 30

Results

Effect of condition

The table below shows WAIC scores and weights for each model. The column BF_int_model shows the Bayes Factor for the model comparison between the full model (interaction model) and each reduced model (here: with main effect of condition or null model without condition).

model                                          WAIC    SE     weight  BF_int_model
model_w_interaction: condition * age_num + RE  325.18  10.44  0.60    -
model_w_condition: condition + age_num + RE    327.67  9.49   0.17    27.4
null_model: age_num + RE                       327.09  9.02   0.23    79.09

The model comparison favors the interaction model. Bayes Factors also suggest that this model fits the data better than the other models. Looking at the model predictors, we see a positive interaction effect. This mirrors what we see in the graphs above, namely that younger children do not distinguish between the two conditions, but older children do.
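To make the link between a positive interaction term and this age pattern concrete, here is a small Python sketch of a logistic model with a condition-by-age interaction. Everything in it is hypothetical: the coefficient values, the 0/1 coding of condition, which condition is coded as 1, and centering age at 3 are illustration choices, not estimates or coding decisions from the fitted model.

```python
from math import exp


def inv_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1 / (1 + exp(-x))


def predicted_prob(age, same_speaker, b0=0.0, b_cond=0.0, b_age=0.0,
                   b_int=1.2, age_center=3.0):
    """Predicted probability of a same-category choice (hypothetical coefficients)."""
    age_c = age - age_center
    eta = b0 + b_cond * same_speaker + b_age * age_c + b_int * same_speaker * age_c
    return inv_logit(eta)


for age in (3, 4):
    diff = predicted_prob(age, 1) - predicted_prob(age, 0)
    print(f"age {age}: difference between conditions = {diff:.2f}")
```

With no main effect of condition and a positive interaction, the two conditions coincide at the centering age and diverge as age increases, which is the qualitative pattern described above.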

Posterior distribution for model fixed effects. Point indicates posterior mean, thick line shows 50%CI and thin line shows 95% CI.

Conclusion

Children make inferences about what the general “topic” of a discourse is and use this to identify the referent of an ambiguous utterance. The amount of input they receive seems to have no direct effect on this inference. Older children treat the discourse topic as something that is specific to a particular speaker.